Diagnosis of Diabetes Using a Random Forest Algorithm

Authors

  • Kozegar, Ehsan Computer Engineering, Faculty of Engineering, University of Guilan, Iran
  • Ravaei, Bahman Computer Engineering, Faculty of Engineering, Yasouj University, Iran
Abstract:

Background: Diabetes is the fourth leading cause of death in the world. Because so many people worldwide have the disease or are at risk of it, diabetes can be called the disease of the century. It has devastating effects on public health and, if diagnosed late, can cause irreparable damage to the eyes, kidneys, heart, arteries and other organs. Methods for diagnosing the disease in its early stages are therefore necessary. In this article, data mining is used to diagnose diabetes.

Methods: The main algorithm used in this paper is the random forest algorithm. To evaluate the efficiency of the proposed algorithm in diagnosing diabetes, a data set of 768 samples (patients) with 8 features each was used. Because the random forest algorithm is an ensemble built from several decision trees, it achieves high accuracy in diagnosing diabetes.

Results: Using this algorithm, we were able to increase the accuracy of diabetes diagnosis to 99.86%.

Conclusion: Diabetes is the fourth leading cause of death in the world, and different algorithms have been used to diagnose it. We sought an algorithm that is highly accurate compared with other algorithms for diagnosing this disease.
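The abstract describes a random forest as an ensemble of several decision trees. The following is a minimal sketch of that idea, not the authors' implementation: each tree is grown on a bootstrap sample, each split considers a random subset of features, and the forest predicts by majority vote. The data are synthetic (an assumption made only so the example runs on its own) with 8 features to mirror the abstract; the function names, the Gini criterion, the depth limit, and the tree count are all illustrative choices and do not come from the paper.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y, feature_ids):
    """Return the (feature, threshold) that lowers weighted Gini most, or None."""
    best, best_score, n = None, gini(y), len(y)
    for f in feature_ids:
        for t in sorted({row[f] for row in X}):
            left = [y[i] for i in range(n) if X[i][f] <= t]
            right = [y[i] for i in range(n) if X[i][f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(X, y, n_try, rng, depth=0, max_depth=5):
    """Grow one tree; every node looks at n_try randomly chosen features."""
    majority = Counter(y).most_common(1)[0][0]
    if depth >= max_depth or len(set(y)) == 1:
        return majority  # leaf: majority class
    split = best_split(X, y, rng.sample(range(len(X[0])), n_try))
    if split is None:
        return majority  # no split improved impurity
    f, t = split
    left = [i for i in range(len(y)) if X[i][f] <= t]
    right = [i for i in range(len(y)) if X[i][f] > t]
    return (f, t,
            build_tree([X[i] for i in left], [y[i] for i in left],
                       n_try, rng, depth + 1, max_depth),
            build_tree([X[i] for i in right], [y[i] for i in right],
                       n_try, rng, depth + 1, max_depth))

def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf's class."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_forest(X, y, n_trees=15, seed=0):
    """Train n_trees trees, each on a bootstrap sample of the rows."""
    rng = random.Random(seed)
    n_try = max(1, int(len(X[0]) ** 0.5))  # common heuristic: sqrt(#features)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx],
                                 n_try, rng))
    return forest

def predict_forest(forest, row):
    """Majority vote over all trees."""
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]

def make_toy_data(n=160, n_features=8, seed=1):
    """Synthetic stand-in data (NOT the paper's 768-patient data set)."""
    rng = random.Random(seed)
    X = [[rng.random() for _ in range(n_features)] for _ in range(n)]
    y = [1 if row[1] + row[5] > 1.0 else 0 for row in X]
    return X, y

X, y = make_toy_data()
X_train, y_train, X_test, y_test = X[:120], y[:120], X[120:], y[120:]
forest = fit_forest(X_train, y_train)
accuracy = sum(predict_forest(forest, r) == t
               for r, t in zip(X_test, y_test)) / len(y_test)
print(f"held-out accuracy on toy data: {accuracy:.2f}")
```

Accuracy is measured here on rows the forest never saw during training. The number printed for this toy data says nothing about the 99.86% reported in the abstract, which depends on the authors' data set and evaluation protocol.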


Similar resources

A Random Forest Classifier based on Genetic Algorithm for Cardiovascular Diseases Diagnosis (RESEARCH NOTE)

Machine learning-based classification techniques support decision making in healthcare, especially in disease diagnosis, prognosis and screening. Healthcare datasets are voluminous, and their high dimensionality leads to a slower learning rate and higher computational cost. Feature selection is expected to deal with the high dimen...


Prediction and Diagnosis of Diabetes Mellitus using a Water Wave Optimization Algorithm

Data mining is an appropriate way to discover information and hidden patterns in large amounts of data, where the hidden patterns cannot be easily discovered in normal ways. One of the most interesting applications of data mining is the discovery of diseases and disease patterns through investigating patients' records. Early diagnosis of diabetes can reduce the effects of this devastating disea...


A Random Forest Turbulence Prediction Algorithm

Unlike traditional pilot reports, in-situ EDR reports of atmospheric turbulence from commercial aircraft contain both positive and negative instances, are reported regularly, and have relatively accurate positions and timestamps. These data therefore make it feasible to perform more sophisticated analyses of the causes of atmospheric turbulence than were formerly possible. Several real-time gri...


Prediction of PKCθ Inhibitory Activity Using the Random Forest Algorithm

This work is devoted to the prediction of a series of 208 structurally diverse PKCθ inhibitors using the Random Forest (RF) based on the Mold2 molecular descriptors. The RF model was established and identified as a robust predictor of the experimental pIC50 values, producing a good external R²(pred) of 0.72 and a standard error of prediction (SEP) of 0.45 for an external prediction set of 51...


Classification of genome data using Random Forest Algorithm: Review

Random Forest is a popular machine learning tool for classification of large datasets. The datasets classified with the Random Forest algorithm (RF) are correlated, and the interaction between the features leads to the study of genome interaction. The review covers RF with respect to its variable selection property, which reduces the large datasets into relevant samples and predicting the accuracy f...


Prognosis of multiple sclerosis disease using data mining approaches random forest and support vector machine based on genetic algorithm

Background: Multiple sclerosis (MS) is a degenerative inflammatory disease which is most commonly diagnosed by magnetic resonance imaging (MRI). However, because the MRI device uses a magnetic field, metal objects in the patient's body can endanger the patient's health, disrupt the functioning of the MRI, and distort the images. Due to limitations of using MRI device, screening ...




Journal

Volume 21, Issue 2

Pages 92-100

Publication date: 2021-07


Keywords

No keywords have been provided for this article.
